PALEOBOTANY, EVIDENCE, AND MOLECULAR DATING: AN EXAMPLE FROM THE NYMPHAEALES

Kevin C. Nixon¹


ABSTRACT 


In recent years, most systematics studies have focused on phylogenetic analyses of molecular data sets. The latest trend has 
been to add molecular dating to these phylogenies utilizing methods such as nonparametric rate smoothing (NPRS) and 
penalized likelihood (PL) and calibrating these analyses using (often only one or very few) fossils. The success of such 
approaches is dependent on several assumptions, including a local clocklike behavior of evolution, the accuracy of the 
phylogeny, the correct phylogenetic placement of fossils, and the consistency of particular fossils in extrapolating rates 
throughout a given phylogenetic tree. An example of such an analysis of the Nymphaeales is provided to illustrate 
inappropriate use of fossils in this context and faulty results based on inadequate and/or inappropriate analyses. Neither fossil 
identifications nor a particular method of molecular dating should be called into question based on the disparity of a single 
analysis. Indeed, fossil observations and molecular dating are often at odds due to failure of the data to meet minimum 
assumptions of a clocklike behavior and poor or inadequate sampling of extant taxa, molecular sequence data, and/or fossils. 
Rejection or acceptance of either the fossils or the molecular dates resulting from their use should be considered in light of 
direct analysis of the fossils and compared to other analyses using other fossils and/or other extant data sets. Rejection of 
fossils based on unexpected results is merely verificationism. 


Key words: 


Angiosperm, molecular clock, nonparametric rate smoothing, Nymphaeales, penalized likelihood. 


In recent years, the fields of plant and animal 
systematics have progressed from a relatively subjec- 
tive and often authoritarian process of classification 
(see Cronquist, 1981) to one based on monophyletic 
groupings following the recommendations of Hennig 
(1966), although proponents of paraphyletic classifications
persist (e.g., Brummit, 2006; Hörandl, 2007). 
The existence of comparative DNA sequence data for 
most major taxa now provides a basis for coordinating 
phylogenies of extant species and fossils via simulta- 
neous molecular-morphological analyses that include 
both living and extinct terminal taxa (numerous 
papers, including Sun et al., 2002). In turn, such 
combined analyses (or surrogates that attempt to place 
fossils on independently calculated phylogenies) can 
be used to estimate minimum ages of hypothetical 
ancestral nodes (Crepet et al., 2004). A recent and 
increasingly popular use of this approach is to use 
fossil calibration points to estimate evolutionary rate 
parameters for trees on the basis of molecular data 
and thus extend the dating of nodes beyond the 
immediate attachment point of the fossils (Sanderson 
et al., 2004). These methods can also be used to 
calculate statistical confidence intervals for these 
estimated ages. In the broadest sense, such analyses 
rely on what was originally referred to as the 
molecular clock: the assumption that evolutionary 
rates of particular genes are constant either globally 
(the ideal situation) or sufficiently within parts of the 
tree to be amenable to models that attempt to 
minimize rate change from one node to another. This 
idea, that evolution may occur at least locally in a 
clocklike manner, can be traced back at least four 
decades (Zuckerkandl & Pauling, 1962, 1965), 
although the recent discussion of this topic in the 
context of phylogenetic analysis has shifted somewhat, 
and several issues deemed important in the original 
dialogue (such as gene neutrality; see King & Jukes, 
1969) are seldom mentioned in current discussions. In 
many ways, the idea of a molecular clock underlies all 
currently used model-based approaches such as 
maximum likelihood or Bayesian phylogenetic anal- 
ysis, which rely on predictable patterns of molecular 
change that can be designated in a model with 
relatively few parameters. Clocklike evolution is also 
the underlying justification for using neighbor-joining 
trees as an approximation of phylogenetic pattern, 
based on a model of minimum evolution and a
least-squares-based algorithm (Saitou & Nei, 1987). 

¹ L. H. Bailey Hortorium, Department of Plant Biology, Cornell University, Ithaca, New York 14853, U.S.A. KCN2@cornell.edu.
doi: 10.3417/2007063
Ann. Missouri Bot. Gard. 95: 43–50. Published on 11 April 2008.

It is not the intent of this paper to review or criticize 
various methods of node-dating using molecular trees 
calibrated with fossil data; indeed, some of the 
proponents of these methods already have provided 
relatively strong criticisms and caveats (Sanderson et 
al., 2004). To a great extent, the acceptance or 
rejection of molecular-dating approaches is based on 
faith in the general concept of the molecular clock and 
the accuracy and/or reliability of particular models 
that are incorporated into such methods. In many 
cases, age estimates of important nodes based on 
molecular dating are far older (and only rarely, 
younger) than minimum age estimates based on 
existing fossil evidence alone (see an excellent review 
of these issues by Graur & Martin, 2004). Of interest 
here is not whether molecular dating is accurate but, 
instead, what the relationship between these two 
approaches is and how to resolve discrepancies in 
results. The two approaches can be summarized as 
follows: (1) Estimation of the minimum ages of clades 
is based on direct observations of fossils—their 
identification in a phylogenetic context—combined 
with dating of the strata in which they occur and 
subsequent estimation of minimum ages for clades 
based on the oldest fossil record included by that 
clade (e.g., Crepet et al., 2004). This method may or 
may not involve additional extrapolation of ages based 
on branch lengths. The phylogeny on which the 
estimations are based might be molecular, morpho- 
logical, or combined, but there must be some way to 
explicitly assign fossil taxa to positions in the tree. (2) 
Estimation of ages is based on molecular-generated 
branch lengths and topologies interpreted with a 
particular algorithm (and implicit model), such as 
penalized likelihood (PL) or nonparametric rate 
smoothing (NPRS; Sanderson, 1997), and calibrated 
with one or more fossils identified and dated by the 
above methods. A molecular tree must be available, 
and, in general, there has been little discussion of the 
exact requirements for fossil selection and how to 
determine exactly where it should be placed in the 
tree (see Gandolfo et al., 2008). 
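
Approach (1) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the tree is a simplified, hypothetical arrangement of a few genera named in this paper, the names `TREE`, `FOSSILS`, and `minimum_ages` are invented here, and the fossil ages are placeholder values rather than results. The only point is the logic: a node can be no younger than the oldest fossil placed anywhere inside its clade.

```python
# Hypothetical rooted tree: each internal node maps to its children.
TREE = {
    "root": ["Amborella", "nymphaeales"],
    "nymphaeales": ["cabombaceae", "nymphaeaceae"],
    "cabombaceae": ["Cabomba", "Brasenia"],
    "nymphaeaceae": ["Nuphar", "victoria_euryale"],
    "victoria_euryale": ["Victoria", "Euryale"],
}

# Placeholder fossil assignments: node the fossil is placed in -> age (Ma).
FOSSILS = {"nymphaeaceae": 90.0, "nymphaeales": 120.0}


def minimum_ages(tree, fossils, node="root", out=None):
    """Return {internal node: minimum age in Ma}.

    A node's minimum age is the age of the oldest fossil assigned to it
    or to any node nested inside it; 0.0 means no fossil constraint.
    """
    if out is None:
        out = {}
    oldest = fossils.get(node, 0.0)
    for child in tree.get(node, []):
        minimum_ages(tree, fossils, child, out)
        # Internal children were just recorded in `out`; tips fall back
        # to any fossil assigned directly to them.
        oldest = max(oldest, out.get(child, fossils.get(child, 0.0)))
    if node in tree:  # record internal nodes only
        out[node] = oldest
    return out


ages = minimum_ages(TREE, FOSSILS)
```

In this sketch the fossil constraints propagate only toward the root: the placeholder 120 Ma fossil sets a minimum for `nymphaeales` and `root`, while clades without fossils (here `cabombaceae` and `victoria_euryale`) receive no constraint at all. No rates or branch lengths are involved, which is what distinguishes this approach from approach (2).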

In order to discuss problems that arise when these 
two approaches yield different answers, I will use a 
particular example that involves fossils that I have 
studied directly. 


BACKGROUND: THE MOLECULAR CLOCK 


It is generally accepted that the idea of the 
molecular clock was first introduced by Zuckerkandl 
and Pauling (1962, 1965). Soon after, in the mid- to 
late 1960s, the idea of molecular clocks and the 
potential to use such clocks for constructing phylogenetic
trees, estimating evolutionary rates, and/or 
estimating the age of particular clades emerged. 
Because DNA sequencing was not feasible at that 
time, most of the discussion of molecular clock tree 
construction and dating centered on (very scarce) 
protein sequence data, particularly cytochrome c. 


One major assumption of the original approach to 
molecular clocks was that neutral genes or sites, and 
what can be termed synonymous changes at these 
sites, were the most likely to have clocklike rates and 
thus were the best candidates for phylogenetic 
reconstruction. It is important to note that the concept 
of (selective) neutrality in this context excludes both 
sites that are variable (and selected for or against in 
various ways), as well as, and perhaps most im- 
portantly, sites that are conserved (e.g., stabilizing 
selection) and show little variation. A truly neutral site 
is one in which selection is absent, whether it varies 
much or little. 

At one point, the idea that most or all evolutionary 
changes are actually (or virtually) neutral also entered 
the debate. This neutral theory—the idea that most 
evolutionary change is non-Darwinian—is closely tied 
to the molecular clock, as can be seen in the following 
quotation (King & Jukes, 1969: 796): “The rate of 
non-Darwinian change equals the rate of selectively 
neutral mutation and is independent of environmental 
fluctuations and of population size. For a given 
protein, the rate of such change should be nearly 
constant. Darwinian change, in contrast, is under the 
influence of changing environment, adaptive radia- 
tion, fluctuations in population size, and such factors 
as adjustment to major changes in the genetic 
background. Thus it might well be subject to bursts 
of rapid change in some species and relative stability 
in others.” 

Following this reasoning, genes (or individual 
nucleotide bases) undergoing selective (Darwinian) 
evolution will be poor candidates for a molecular clock. 
Sites under selective pressure will have heterogeneous 
rates, both vertically within clades and horizontally 
among clades. Such sites will not exhibit clocklike 
behavior in their rate of evolutionary change. 

Some recent papers have suggested that bases at 
codon third positions have more of a clocklike pattern 
of evolutionary rate than first- or second-position 
bases (e.g., Mercer & Roth, 2003). This is intuitive: 
if substitutions in third positions are synonymous in 
terms of the proteins coded for, then they would be 
neutral in the sense of the above discussion and might 
be better candidates for modeling a molecular clock. 
The potential for third-position bases to provide 
phylogenetically informative variation has also been 
debated, and Källersjö et al. (1999) found that more 
groups were supported in one large molecular analysis 
by (presumably neutral) third-position bases than by 
first and second positions. This may be due to rate 
issues (because they are neutral and evolve faster, 
variation in third-position bases resolves more 
terminal groups). At any rate, the issues of saturation 
and the Felsenstein zone (see Felsenstein, 1978) have 
generally been invoked to suggest that third positions 
are less useful due to repeated substitutions. These 
issues are not meant to be addressed here but 
certainly are necessary to an understanding of how 
molecular dating might perform with different data 
sets and different analyses (e.g., parsimony vs. 
maximum likelihood). 

In a recent review of the subject, Benton and Ayala 
(2003) did not address the primary issue of whether 
evolutionary rates are clocklike or issues related to 
Darwinian versus non-Darwinian evolution. Instead, 
they framed the problem as a conflict between results 
from fossils providing dates that are younger and 
molecular clock methods providing dates that are 
much older. Anecdotally, they provide an example 
where, with additional data and analyses, the two 
disparate estimates come closer together, and they 
present this as a hopeful indication that we are on the 
road to reconciling the problems associated with 
molecular clock dating. 

It is important to note that although some of the 
most popular molecular-dating methods, e.g., NPRS 
(Sanderson, 1997), on the surface may not seem to 
require rate constancy, they do assume local rate 
constancy as opposed to global rate constancy. If rates 
are completely unpredictable (i.e., poorly correlated 
with time) and/or change rapidly and frequently 
within and among clades, even methods such as 
NPRS will fail to provide results. Other issues related 
to the selection of suitable fossils to calibrate 
molecular dating are discussed elsewhere in this 
issue (Gandolfo et al., 2008). 
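
The assumption of local rate constancy can be made concrete with a small Python sketch. It loosely follows the idea behind rate smoothing (Sanderson, 1997) but is a deliberate simplification, not the r8s implementation: the penalty below is simply the sum of squared differences between the rate on each branch and the rate on its parent branch. The toy tree, the branch lengths, and the names `PARENT`, `LENGTH`, `roughness`, and `smoothest_age_for_x` are all hypothetical.

```python
# Toy tree (hypothetical): root -> (X, C); X -> (A, B). Node -> parent.
PARENT = {"X": "root", "C": "root", "A": "X", "B": "X"}
# Branch length above each node, in substitutions per site (made-up values).
LENGTH = {"X": 0.05, "C": 0.10, "A": 0.05, "B": 0.05}


def branch_rates(age):
    """Rate on the branch above each node: substitutions / elapsed time."""
    return {n: LENGTH[n] / (age[PARENT[n]] - age[n]) for n in PARENT}


def roughness(age):
    """Sum of squared rate changes between each branch and its parent branch.

    Branches attached to the root have no parent branch and are skipped
    in this simplified version.
    """
    rate = branch_rates(age)
    return sum((rate[n] - rate[PARENT[n]]) ** 2
               for n in PARENT if PARENT[n] in rate)


def smoothest_age_for_x(root_age=100.0, candidates=range(10, 100, 10)):
    """Grid-search the age of node X that minimizes roughness (root fixed)."""
    def penalty(x_age):
        return roughness({"root": root_age, "X": float(x_age),
                          "A": 0.0, "B": 0.0, "C": 0.0})
    return min(candidates, key=penalty)


best_x_age = smoothest_age_for_x()
```

With these made-up, perfectly clocklike branch lengths, the search settles on the age for X at which the rate does not change between ancestor and descendant branches. The same penalty would pull node ages toward equal rates even if the real history involved an abrupt rate change, which is the sense in which such methods assume local rather than global rate constancy.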


AN EXAMPLE: THE NYMPHAEALES 


In the past two decades, molecular analyses of 
angiosperms have placed various modern groups as 
concordant sister taxa to the bulk of the extant 
angiosperms, ranging from Ceratophyllum L. based on 
rbcL alone (Chase et al., 1993; Nixon, 1999) to various 
permutations with Amborella Baill. as the sole sister 
taxon of all remaining extant angiosperms (e.g., Soltis 
et al., 2000). In these latter trees, the nymphaealean 
clade, along with Illiciales and Austrobaileyales, is 
generally placed as a sister group of all remaining 
angiosperms excluding Amborella. It should be noted 
that statements regarding these trees often suggest 
that Amborella (or Ceratophyllum) is a basal or 
primitive angiosperm, which of course is merely a 
misinterpretation of sister-group relationships (such 
terminology in the context of fossils has more 
meaning: Sun et al., 2002). It is easily understood 
that if the three-gene tree (Soltis et al., 2000) is 
correct, then Amborella and any extant angiosperm 
diverged from the common ancestor of angiosperms at 
the same time, and therefore the lineage culminating 
in Amborella has had the same amount of time to 
diverge from the ancestral angiosperm as that of any orchid, 
mistletoe, grass, or Texas bluebell. Amborella is no 
more basal within the angiosperms than any other 
extant species of angiosperm. Because of their aquatic 
nature and presumably early divergence (whether 
termed basal or not), the Nymphaeales have attracted 
considerable attention in the context of the putatively 
primitive angiosperms. Indeed, this interest has 
increased with the discovery that some of the earliest 
identifiable angiosperm fossils appear to be aquatic 
(Sun et al., 2002). 

A recent paper by Yoo et al. (2005) that focuses on 
Nymphaeales provides a useful example of the conflict 
between traditional paleobotanical evidence and the 
results of molecular-dating methods. Yoo et al. (2005) 
used four approaches (strict molecular clock, NPRS, 
PL, Bayesian) in an attempt to calculate divergence 
times for the modern crown group of Nymphaeales as 
well as the age of the angiosperms. They utilized a 
previously published morphological cladogram of 
Nymphaeales (Yoo et al., 2005: fig. 1, from Les et 
al., 1999). This tree was used to map molecular 
changes using sequences for rbcL, matK, and 18S that 
were downloaded from GenBank and aligned using 
CLUSTAL X (Thompson et al., 1997) with the default 
options, and the alignment was then adjusted 
manually (Yoo et al., 2005). This tree has 16 
terminals, including four gymnosperm outgroups 
(Ginkgo L., Gnetum L., Larix Mill., Taxus L.), 
Amborella, the Illiciales or Austrobaileyales represented
by three terminals (Illicium L., Schisandra 
Michx., Austrobaileya C. T. White), and the eight 
commonly recognized genera of Nymphaeales (Victoria
Lindl., Nymphaea L., Euryale Salisb., Ondinea 
Hartog, Barclaya Wall., and Nuphar Sm. from the 
Nymphaeaceae, and Cabomba Aubl. and Brasenia 
Schreb. from the Cabombaceae). 

It is important to note that Yoo et al. (2005) did not 
calculate a new cladogram based on the data matrix 
that they generated from downloaded sequences, but, 
instead, the pre-aligned molecular sequences were 
superimposed on the morphological tree from Les et 
al. (1999). The program r8s (Sanderson, 2003) was 
then used to analyze the molecular matrix using the 
Les et al. (1999) tree as the input tree. It is not clear 
why Yoo et al. (2005) did not attempt a new analysis 
of the molecular data, given that the original data 
matrices needed to be included in order to estimate 
molecular divergence times. They then analyzed the 
matrix/tree combination with various methods of 
molecular dating available in r8s: “[B]ecause all tests 
of rate heterogeneity among lineages were highly 
significant, we used three approaches that have been 
proposed for use with heterogeneous rates, NPRS 
(Sanderson, 1997, 1998), PL (Sanderson, 2002), and a 
Bayesian method (Thorne et al., 1998; Kishino et al., 
2001; Thorne & Kishino, 2002)” (Yoo et al., 2005: 695). 

The results of the molecular dating analysis by Yoo 
et al. (2005: 697) suggested a relatively recent 
Tertiary divergence for the crown group of Nymphaeales,
even though Cretaceous fossils had been 
previously described for the family Nymphaeaceae: 
“Our divergence time estimates indicate that extant 
Nymphaeales diversified relatively recently, whereas 
the stem lineage to Nymphaeales is old, based on a 
fossil attributed to Nymphaeales from the Early 
Cretaceous (125-115 mya; Friis et al., 2001) and a 
fossil attributed to Nymphaeaceae from the middle 
Cretaceous (~90 mya; Gandolfo et al., 2004).” In 
other words, the age of diversification found by Yoo et 
al. (2005) for Nymphaeales based on molecular-dating 
methods is considerably younger than the published 
fossil evidence (in contrast to the more common 
situation, where fossils are typically younger than 
estimated ages based on molecular trees). Oddly, Yoo 
et al. (2005: 697) use the rare but similar contradic- 
tory results found in other studies to validate their own 
results: “These results for Nymphaeales indicating 
recent diversification in an ancient lineage agree with 
similar findings for the basal angiosperms Chloranthaceae
(Zhang & Renner, 2003) and Illicium 
(Illiciaceae; A. Morris, unpublished data). The fossil 
record indicates clearly that Chloranthaceae represent 
one of the oldest angiosperm lineages, with unequivocal
reproductive structures resembling those of 
Hedyosmum from the Barremian–Aptian boundary, 
approximately 125 mya (see Friis et al., 1994, 1999; 
Friis, 1997; Doyle et al., 2003; Eklund et al., 2004, for 
recent interpretations and lists of earlier references). 
However, divergence time estimates based on molecular
data indicate that the extant genera of Chloranthaceae
are relatively young (i.e., 60-29 mya for 
Hedyosmum, 22-11 mya for Chloranthus, and 18-9 mya
for Ascarina; Zhang & Renner, 2003).” 

There seems little point in belaboring the faulty 
logic in citing agreement with other, unrelated 
analyses in which fossil ages are older than the age 
of crown groups estimated by molecular models. 
Usually, when molecular dating methods repeatedly 
provide significantly older dates than fossil evidence, 
an incomplete fossil record is invoked to explain 
the discrepancy. In 
contrast, since in their study Yoo et al. (2005) found 
significantly younger diversification of the crown 
group for Nymphaeales than well-studied fossils indicate, they 
were forced to decide that the fossil identifications 
were incorrect. Unfortunately for this interpretation, 
in the case of the Laurales example, the Chloranthaceae
have a long unequivocal fossil record with some 
fossils that have the same characters as modern 
Hedyosmum Sw. from the early Cretaceous (Eklund et 
al., 2004). However, molecular clock dating tells us 
this cannot be so; according to the rate model(s), 
Hedyosmum did not diversify until much later. The 
pattern here is apparent: if the dates are too old, then 
we just do not have enough fossil data; if the dates are 
too young, then the fossils are misidentified. In other 
words, no independent tests of the molecular dates are 
allowed, and fossils are the only independent tests. 
Apparently, the particular results of the molecular 
dating are not called into question in cases of conflict, 
only the fossil evidence. 


THE AGE OF ANGIOSPERMS 


Apparently in order to provide evidence that both 
nymphaealean fossils from the Cretaceous (Friis et al., 
2001; Gandolfo et al., 2004) were erroneously placed, 
Yoo et al. (2005: 695) then compared calibration of 
the age of the entire angiosperm clade using these two 
fossils (in two independent PL analyses) in r8s. Given 
the previous putative overestimation of the age of 
Nymphaeales, the results were rather shocking: the 
angiosperm clade diverged 1093 million years ago 
(Ma; more than one billion years ago) based on 
Microvictoria Gandolfo, Nixon & Crepet (Gandolfo et 
al., 2004) and 1457.3 Ma (almost 1.5 billion years 
ago; Yoo et al., 2005: table 2) based on the Friis et al. 
(2001) fossil. Again, Yoo et al. (2005) suggested that 
this provides evidence that the fossils are phylogenetically
misplaced. As an aside, note the high 
precision for the age of angiosperms based on the 
Friis et al. (2001) fossil calibration (to ca. 300,000 
years of almost 1.5 billion years). 

Yoo et al. (2005: 700) proceed to use the poor 
performance of PL (in the sense of providing what 
might be considered wildly inaccurate estimates of the 
origin of angiosperms) as an indictment of the 
placement of Microvictoria as a crown group Nym- 
phaeaceae fossil: “There seem to be two possible 
explanations for the disparity between the fossil 
record for angiosperms and the age inferred here 
using PL and Microvictoria, as placed within Nym- 
phaeaceae by Gandolfo et al. (2004) as the calibration 
point. Either methods of estimating divergence times 
from molecular data are highly dubious or the 
phylogenetic placement of Microvictoria in Gandolfo 
et al. (2004) may need to be reconsidered.” 

There are more than two possible explanations for 
the problematically old estimates calculated using PL 
for the origin of angiosperms, and the dichotomy of 
two possible explanations presented by Yoo et al. 
(2005) is a false one. First, the fact that in a particular 
instance a method fails does not make it highly 
dubious, and certainly this is not a reasonable 
conclusion. This is simply a classic case of a straw 
man, clearly intended to steer the reader toward the 
second alternative explanation provided. Maximum 
likelihood, parsimony, and almost any conceivable 
phylogenetic method fail under certain circumstances 
(e.g., long-branch attraction or repulsion, a.k.a. the 
Felsenstein zone or the Farris zone; Siddall, 1998), 
and this fact alone does not impugn either of these 
widely used methods. Indeed, with any method one 
expects failure a certain percentage of the time, 
particularly when the assumptions of the method are 
not met. Certainly, there is a strong possibility that 
evolution is not acting in a sufficiently clocklike 
manner in the region of Nymphaeales relative to other 
parts of the angiosperm tree, and/or the rates in this 
part of the tree are not representative of rates 
elsewhere. Such heterogeneity of rate could result in 
fossils from these clades being poor choices for 
calibration of an entire seed plant cladogram, and this 
is actually the best alternative explanation. In other 
words, it may very well be that rates can differ 
dramatically between ancestor and descendant nodes, 
and using a single fossil from an area of the tree that is 
heterogeneous for rates and not typical of average 
rates elsewhere will likely result in estimates that are 
obviously wrong. Given that Yoo et al. (2005) had 
already established the high significance of rate 
heterogeneity among lineages, it seems odd that they 
did not provide this obvious possible interpretation of 
their results. 

In addition to the third explanation above (assump- 
tions of PL were not actually met due to highly 
heterogeneous rates), there is also the possibility, for 
those who adhere to the primacy of molecular-dating 
methods, that the angiosperms are much older than 
previously thought (interestingly, in the majority of 
results of molecular dating, this is the finding—that 
the age of major groups is much older than previously 
thought—yet, in those cases, the argument is typically 
provided that we merely lack fossils of sufficient age; 
see Graur & Martin, 2004). It seems that when ages 
are acceptably older than expected based on fossil 
evidence alone, molecular dating is working well, and, 
when they are unacceptably older than expected, the 
fossil record (or identification) is bad. 

If one accepts the general validity of molecular- 
dating methods, as well as the identification of the 
nymphaealean fossils by two different paleobotanical 
research groups (Friis et al., 2001; Gandolfo et al., 
2004)—one of which was included in cladistic 
analysis—the “too old” (Yoo et al., 2005: 699) 
angiosperm results might suggest two interpretations. 
First, perhaps there were significant changes in rate in 
the Nymphaeales lineage, e.g., a rapid diversification 
in the Cretaceous, followed by much slower rates 
within the crown groups. Second, a single fossil from a 
particular clade might not be sufficient to provide 
accurate results using molecular-dating methods. 
These interpretations (not considered by Yoo et al., 
2005) do not impugn molecular dating per se or 
traditional paleobotanical observation (which, after 
all, must be the basis for calibration of trees used in 
molecular dating). The idea of highly heterogeneous 
rates, with rapid diversification followed by relative 
stasis, was popularized a few decades ago by Eldredge 
and Gould (1972) when they championed the concept 
of punctuated equilibrium. Perhaps this has fallen out 
of favor because it is difficult to model and, if 
demonstrated to be common, would cause severe 
problems with simple rate-smoothing methods, which 
would tend to over-equalize rates among ancestors, 
descendants, and disparate clades. In this context, 
based on current large-scale phylogenetic trees, it is 
likely that the original angiosperms were not fully 
aquatic (Crepet et al., 2004), and it is possible that 
there was a period of rapid diversification in the 
nymphaealean clade in invasion of the aquatic 
habitat, followed by subsequent relative stasis in a 
stable aquatic habitat. 

Yoo et al. (2005: 700) provide a rather complicated 
explanation of why Microvictoria might have been 
misplaced: “Alternatively, Microvictoria may be 
misplaced in the phylogenetic analysis of Gandolfo 
et al. (2004), perhaps due to homoplasy in the crucial 
morphological characters scored and included in that 
study. That is, a now-extinct assemblage of early 
angiosperms may have possessed suites of traits not 
found in any extant groups.” This speculation needs 
little discussion. There might be any number of clades 
that are extinct and never discovered as fossils that 
any particular existing fossil might be related to. The 
same can be said for any extant taxon as well. There is 
always the possibility of something as yet unobserved 
to counter what careful observation and analysis show. 

As a related aside, another issue brought forth by 
Yoo et al. (2005) impinges on the effect of fossils with 
ambiguous characters on local parts of a consensus 
tree, which is a well-known phenomenon (Nixon & 
Wheeler, 1992; Nixon, 1996). Yoo et al. (2005: 699) 
apparently misinterpret deresolution due to multiple 
placements of a fossil with actual clade conflict and 
misinterpret the consensus tree presented by Gandolfo 
et al. (2004): “However, Gandolfo et al.’s topology 
disagrees with the morphological analysis of Les et al. 
(1999), who found strong support for Victoria and 
Euryale, consistent with many previous inferences of 
relationship in Nymphaeaceae; this sister-group 
relationship is not evident in Gandolfo et al.’s tree.” 




Apparently, Yoo et al. (2005) misunderstand polytomies:
an unresolved polytomy is not in topological 
disagreement with any dichotomy that it implies, which is 
well known in systematics theory and the basis for 
creation and interpretation of various kinds of 
consensus trees (see Nixon & Carpenter, 1996, for a 
complete discussion of these issues). In fact, the trees 
found by Gandolfo et al. are identical to those 
reported by Les et al. (1999) when the fossil 
Microvictoria is pruned (removed after the analysis), 
with high support for the Victoria–Euryale clade. The 
fact that Microvictoria floats (Nixon & Wheeler, 1992) 
within this clade does not mean that it reduces 
support for Victoria–Euryale, which always have 
exactly the same relationship to other extant Nym- 
phaeales in all most parsimonious trees (Gandolfo et 
al., 2004). It merely means that given these data, 
Microvictoria is ambiguously placed, sometimes as a 
sister to Victoria, sometimes to both Victoria and 
Euryale. The consensus is thus unresolved in this 
area. When the component most parsimonious 
trees are viewed with only extant taxa considered, Victoria 
and Euryale always form a monophyletic group of two 
taxa, as in Les et al. (1999) and as stated by Gandolfo 
et al. (2004). Note also that Gandolfo et al. (2004), 
because they restricted their analysis to the original 
Les et al. (1999) matrix, did not include the feature of 
prickles on the outside of the flower and pedicel, a 
synapomorphy of the Victoria–Euryale clade that 
would exclude Microvictoria from that clade, since 
the floral cup of Microvictoria is covered externally 
with bracts, not prickles. 
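
The difference between deresolution caused by a floating fossil and real clade conflict can be shown with a short Python sketch. The two four-tip trees below are hypothetical simplifications of the two placements of Microvictoria described above (sister to Victoria, or sister to Victoria plus Euryale); they are not the published trees, and the helper names `clades` and `prune` are invented for this illustration. A strict consensus is taken here as the set of clades shared by all trees.

```python
def clades(tree):
    """Return the set of clades (frozensets of tip names) in a nested-tuple tree."""
    found = set()

    def tips(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)
        return members

    tips(tree)
    return found


def prune(tree, taxon):
    """Remove one tip and collapse any node left with a single child."""
    if isinstance(tree, str):
        return None if tree == taxon else tree
    kept = [p for p in (prune(child, taxon) for child in tree) if p is not None]
    return kept[0] if len(kept) == 1 else tuple(kept)


# Two hypothetical, equally parsimonious placements of the fossil.
T1 = ("Nymphaea", (("Microvictoria", "Victoria"), "Euryale"))
T2 = ("Nymphaea", ("Microvictoria", ("Victoria", "Euryale")))

VE = frozenset({"Victoria", "Euryale"})

# Strict consensus with the fossil included: clades present in both trees.
with_fossil = clades(T1) & clades(T2)

# Strict consensus after pruning the fossil from each tree.
without_fossil = (clades(prune(T1, "Microvictoria"))
                  & clades(prune(T2, "Microvictoria")))
```

In this toy case the Victoria plus Euryale clade is missing from `with_fossil`, because the fossil sits inside that group in one tree and outside it in the other, yet it is present in `without_fossil`. The consensus loses resolution without either source tree contradicting the sister-group relationship of the two extant genera.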

Other molecular-dating analyses, using much larger 
data sets and different fossils to calibrate their 
analyses, have found results that are in conflict with 
Yoo et al. (2005). Wikström et al. (2001) did an NPRS 
analysis on a much larger and more diverse 
angiosperm data set (567 taxa vs. 16 taxa, also with 
three genes), using putatively fagaceous Cretaceous 
fossils to calibrate one of the 567-taxon three-gene 
trees (Soltis et al., 2000). The results presented by 
Wikström et al. (2001) suggest an age for the 
Nymphaeales of 171-153 Ma, with divergence of the 
extant crown group occurring 144-111 Ma (consistent 
with, and considerably older than the fossil Micro- 
victoria, aged at ca. 90 Ma). Given these results, one 
might suggest that perhaps there is a problem with the 
Yoo et al. (2005) paper, not the identification of 
Microvictoria. Wikström et al. (2001: 2211) also note 
the phenomenon of discovering ever-older examples of 
important extant clades of angiosperms: “. . . fossils 
considered to be members of derived angiosperm 
lineages are being documented from increasingly 
older geological deposits. Crepet and Nixon (1998), 
for example, documented Clusiaceae from Turonian 
(90-88 Myr) deposits of New Jersey. . . . The presence 
of such derived groups in Cenomanian–Campanian 
deposits implies either that we have underestimated 
the rapid and explosive nature of the angiosperm 
diversification or that cladogenesis in basal angiosperms
took place considerably earlier than fossil-based
estimates have indicated.” Note that, faced with 
a discrepancy between fossils and molecular dating (a 
very common occurrence), Wikström et al. (2001) 
provide two scientific, reasonable alternative explanations,
and neither reject the fossils nor the 
molecular-dating methods. 


CONCLUSIONS 


Homogeneous rates of sequence evolution (a 
molecular clock) seem to be rare at best. In the 
example provided by Yoo et al. (2005), significant 
heterogeneity of rates was detected. This is consistent 
with many other studies and, in fact, is often easily 
seen in cladograms (whether derived by likelihood or 
parsimony) when branch lengths are displayed. 
Methods such as NPRS and PL merely move the 
molecular clock assumption to within-clade calculations
(by smoothing rates locally) and are susceptible 
to poor calibration, that is, using one or a few fossils from 
clades that are not representative of rates in other 
parts of the tree. If rates are very slow near the 
calibration fossil, one will come up with overestimates 
of ages in other parts of the tree; if rates are very fast, 
one will come up with underestimates. The possibility 
that two different fossils from different parts of a clade 
could provide very different estimates on the same 
trees is not at all surprising. Because of this, if one 
wishes to use molecular-dating methods, one should 
utilize as many different fossils as possible (assuming
they are well identified and phylogenetically placed); 
fossils might be used together or in separate analyses 
to compare how disparate the results become. The fact 
that a particular fossil provides different rate 
estimations than other fossils no more impugns the 
identification of the fossil than it does the assumption 
of clocklike rates near the fossil node that are
representative of rates in the tree as a whole. Because
fossils are based on direct observations and rates
cannot be observed, one must favor fossil identifica-
tion over rate assumptions based on a model.
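
The direction of the bias described above can be illustrated with a toy calculation. The following is a minimal sketch, not the procedure of Yoo et al. (2005) nor the NPRS or PL algorithms: it assumes that the rate implied at a single calibration node is extrapolated unchanged to a second node, and every age, rate, and name in it is a hypothetical value chosen for illustration.

```python
# Toy illustration: one fossil calibration, strict extrapolation of its rate.
# All numbers are hypothetical; this is not NPRS, PL, or any published analysis.

FOSSIL_AGE = 90.0    # Myr, assumed age of each hypothetical calibration fossil
TRUE_AGE = 100.0     # Myr, assumed true age of the node we want to date
TRUE_RATE = 0.001    # substitutions/site/Myr along the lineage below that node

# Observed path length (substitutions/site) from the target node to the present.
target_path = TRUE_RATE * TRUE_AGE

# Local rates (substitutions/site/Myr) near three hypothetical calibration nodes.
local_rates = {"slow": 0.0005, "matching": 0.001, "fast": 0.002}


def estimate_target_age(local_rate):
    """Date the target node using the rate implied at the calibration node."""
    calibration_path = local_rate * FOSSIL_AGE     # what the sequences would show
    implied_rate = calibration_path / FOSSIL_AGE   # rate inferred from the fossil
    return target_path / implied_rate              # extrapolated to the target


estimates = {name: estimate_target_age(rate) for name, rate in local_rates.items()}

for name, age in estimates.items():
    print(f"{name:>8} calibration clade: target dated at {age:.0f} Myr "
          f"(assumed true age {TRUE_AGE:.0f} Myr)")
```

Under these assumed values, the slow-rate calibration roughly doubles the estimated age of the target node and the fast-rate calibration roughly halves it, although the fossil itself is correctly identified and dated in every case.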

By smoothing vertically through clades, NPRS and
related methods explicitly apply a local molecular
clock where possible. Rate smoothing, by isolating
chunks of the tree, is biased against rapid diversifi-
cation followed by slow evolutionary rates, i.e.,
punctuated equilibrium, a popular concept based on
observation of actual fossils through the record. The
issue here is not whether Microvictoria is or is not a
member of the crown group of Nymphaeaceae/
Nymphaeales. The issue goes to the core of how we
do science. Yoo et al. (2005) rejected fossil
identifications based on careful morphological work
and cladistic analyses and presented only two
alternative explanations: that the method is wrong
(“methods of estimating divergence times from
molecular data are highly dubious” [Yoo et al.,
2005: 700]), or the angiosperms are unreasonably
old. These prima facie false premises were then used
to question the placement of the nymphaealean
fossils, which, as noted above, were consistent with
larger, more robust analyses (Wikström et al., 2001).
Because Yoo et al. (2005) did not supply an
alternative morphological analysis of Microvictoria but
instead created a scenario that some unspecified
lineage may have existed with the same characters as
modern Nymphaeaceae, their conclusion that Micro-
victoria (and the fossil from Friis et al., 2001) are
doubtfully part of the Nymphaeaceae crown clade
cannot be evaluated. The Yoo et al. (2005) paper is
useful for pointing out what not to do in terms of fossil 
data and molecular dating. Because rates may (and 
do) change radically within lineages, any particular 
fossil may be a poor choice for calibration of an entire 
tree. Calibration with Microvictoria produces one 
answer; calibration with fagaceous fossils on a larger 
(better taxon-sampled) tree produces an entirely 
different result. This is not surprising. 
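
The bias of rate smoothing against abrupt rate shifts, noted above, can be seen in a simplified form of the roughness penalty that such methods minimize. The sketch below assumes a sum of squared differences between the rates on successive ancestor and descendant branches, in the spirit of NPRS (Sanderson, 1997), reduced to a single unbranched lineage; the function name and all rate values are hypothetical, and the code performs no optimization or dating.

```python
# Simplified roughness penalty for one unbranched lineage (hypothetical rates).
# This evaluates the penalty only; it does not optimize rates or estimate ages.

def smoothing_penalty(rates):
    """Sum of squared rate differences between each branch and its descendant."""
    return sum((anc - desc) ** 2 for anc, desc in zip(rates, rates[1:]))


FAST, SLOW = 0.004, 0.0005   # substitutions/site/Myr, assumed values

# Fast burst followed by an abrupt slowdown (a punctuated pattern).
punctuated = [FAST, FAST, SLOW, SLOW]

# Same starting and ending rates, with the change spread evenly over the branches.
step = (FAST - SLOW) / 3
gradual = [FAST - i * step for i in range(4)]

# No rate change at all (a local clock).
clocklike = [FAST] * 4

penalties = {
    "punctuated": smoothing_penalty(punctuated),
    "gradual": smoothing_penalty(gradual),
    "clocklike": smoothing_penalty(clocklike),
}

for name, value in penalties.items():
    print(f"{name:>10}: penalty = {value:.3e}")
```

With these assumed rates the punctuated history carries the largest penalty and the clocklike history carries none, so a criterion of this form will tend to prefer reconstructions that spread an abrupt shift over neighboring branches.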

The conclusion presented here is simple: it is 
dangerous, and verificationist, to accept or reject 
fossils based on whether they fit a preconceived 
notion of the correct results. It is likely that with 
almost any data set, a fossil could be found to 
calibrate a molecular-dating model that would provide 
a wide range of dates for a particular node of interest. 
If the results become the primary consideration in 
deciding which fossils should be used, the scientific 
endeavor is abandoned and the study becomes an 
exercise in verification. Because of these constraints, 
the most casually studied fossil by a competent 
paleobotanist, if placed in phylogenetic context with 
explicit characters, should be favored in every case 
over what might be termed a secondary analysis—and 
rejecting direct observation based on an interpretation
of history constructed on a model.


Literature Cited 


Benton, M. J. & F. J. Ayala. 2003. Dating the tree of life. 
Science 300: 1698-1700. 

Brummitt, R. K. 2006. Am I a bony fish? Taxon 55: 268-269. 

Chase, M. W., D. E. Soltis, R. G. Olmstead, et al. 1993.
Phylogenetics of seed plants: An analysis of nucleotide 
sequences from the plastid gene rbcL. Ann. Missouri Bot. 
Gard. 80: 528-580. 


Crepet, W. L., K. C. Nixon & M. A. Gandolfo. 2004. Fossil 
evidence and phylogeny: The age of major angiosperm 
clades based on mesofossil and macrofossil evidence from 
Cretaceous deposits. Amer. J. Bot. 91: 1666-1682. 

Cronquist, A. 1981. An Integrated System of Classification of 
Flowering Plants. Columbia Univ. Press, New York. 

Doyle, J. A., H. Eklund & P. S. Herendeen. 2003. Floral
evolution in Chloranthaceae: Implications of a morpho-
logical phylogenetic analysis. Int. J. Pl. Sci. 164(suppl.):
S365-S382.

Eklund, H., J. A. Doyle & P. S. Herendeen. 2004. 
Morphological phylogenetic analysis of living and fossil 
Chloranthaceae. Int. J. Pl. Sci. 165: 107-151. 

Eldredge, N. & S. J. Gould. 1972. Punctuated equilibria: An 
alternative to phyletic gradualism. Pp. 82-115 in T. J. M. 
Schopf (editor), Models in Paleobiology. Freeman Cooper, 
San Francisco. 

Felsenstein, J. 1978. Cases in which parsimony or com- 
patibility methods will be positively misleading. Syst. 
Zool. 27: 401-410. 

Friis, E. M. 1997. Fossil history of magnoliid angiosperms. 
Pp. 121-156 in K. Iwatsuki & P. R. Raven (editors), Evo- 
lution and Diversification of Land Plants. Springer, Tokyo. 

Friis, E. M., K. R. Pedersen & P. R. Crane. 1994. Angiosperm
floral structures from the Early Cretaceous of Portugal. Pl.
Syst. Evol. 8: 31-49.

Friis, E. M., K. R. Pedersen & P. R. Crane. 1999. Early angiosperm diver-
sification: The diversity of pollen associated with angio-
sperm reproductive structures in Early Cretaceous floras
from Portugal. Ann. Missouri Bot. Gard. 86: 259-296.

Friis, E. M., K. R. Pedersen & P. R. Crane. 2001. Fossil evidence of water
lilies (Nymphaeales) in the Early Cretaceous. Nature 410:
357-360.

Gandolfo, M. A., K. C. Nixon & W. L. Crepet. 2004. 
Cretaceous flowers of Nymphaeaceae and implications 
for complex insect entrapment pollination mechanisms in 
early angiosperms. Proc. Natl. Acad. Sci. U.S.A. 101: 
8056-8060. 


Gandolfo, M. A., K. C. Nixon & W. L. Crepet. 2008. Selection of fossils for
calibration of molecular dating models. Ann. Missouri Bot.
Gard. 95: 34-42.

Graur, D. & W. Martin. 2004. Reading the entrails of 
chickens: Molecular timescales of evolution and the 
illusion of precision. Trends Genet. 20: 80-86. 

Hennig, W. 1966. Phylogenetic Systematics. Translated by 
D. D. Davis & R. Zangerl. Univ. of Illinois Press, Urbana.

Hörandl, E. 2007. Neglecting evolution is bad taxonomy.
Taxon 56: 1-5.

Källersjö, M., V. A. Albert & J. S. Farris. 1999. Homoplasy
increases phylogenetic structure. Cladistics 15: 91-93.

King, J. L. & T. H. Jukes. 1969. Non-Darwinian evolution.
Science 164: 788-798.

Kishino, H., J. L. Thorne & W. J. Bruno. 2001. Performance 
of a divergence time estimation method under a 
probabilistic model of rate evolution. Molec. Biol. Evol. 
18: 352-361. 

Les, D. H., E. L. Schneider, D. J. Padgett, M. Zanis, D. E. 
Soltis & P. S. Soltis. 1999. Phylogeny, classification, and 
floral evolution of water lilies (Nymphaeales): A synthesis 
of non-molecular, rbcL, matK, and 18S rDNA data. Syst.
Bot. 24: 28-46.

Mercer, J. M. & V. L. Roth. 2003. The effects of Cenozoic 
global change on squirrel phylogeny. Science 299: 
1568-1572. 

Nixon, K. C. 1996. Paleobotany in cladistics and cladistics in 
paleobotany: Enlightenment and uncertainty. Rev. Palaeo- 
bot. Palynol. 90: 361-373. 




Nixon, K. C. 1999. The parsimony ratchet, a new method for
rapid parsimony analysis. Cladistics 15: 407-414.

Nixon, K. C. & J. M. Carpenter. 1996. On consensus, collaps-
ibility and clade concordance. Cladistics 12: 305-321.

Nixon, K. C. & Q. D. Wheeler. 1992. Extinction and the origin of
species. Pp. 119-143 in M. J. Novacek & Q. D. Wheeler
(editors), Extinction and Phylogeny. Columbia Univ. Press,
New York.

Saitou, N. & M. Nei. 1987. The neighbor-joining method: A 
new method for reconstructing phylogenetic trees. Molec. 
Biol. Evol. 4: 406-425.

Sanderson, M. J. 1997. A nonparametric approach to
estimating divergence times in the absence of rate
constancy. Molec. Biol. Evol. 14: 1218-1232.

Sanderson, M. J. 1998. Estimating rate and time in molecular
phylogenies: Beyond the molecular clock? Pp. 242-264
in D. E. Soltis, P. S. Soltis & J. J. Doyle (editors),
Molecular Systematics of Plants II. DNA Sequencing.
Kluwer, Boston.

Sanderson, M. J. 2002. Estimating absolute rates of molecular
evolution and divergence times: A penalized likelihood
approach. Molec. Biol. Evol. 19: 101-109.

Sanderson, M. J. 2003. r8s: Inferring absolute rates of molecular
evolution and divergence times in the absence of a
molecular clock. Bioinformatics 19: 301-302.

Sanderson, M. J., J. L. Thorne, N. Wikström & K. Bremer. 2004.
Molecular evidence on plant divergence times. Amer. J.
Bot. 91: 1656-1665.

Siddall, M. E. 1998. Success of parsimony in the four-taxon
case: Long-branch repulsion by likelihood in the Farris
zone. Cladistics 14: 209-220.

Soltis, D. E., P. S. Soltis, M. W. Chase, M. E. Mort, D. C.
Albach, M. Zanis, V. Savolainen, W. J. Hahn, S. B. Hoot,
M. F. Fay, M. Axtell, S. M. Swensen, L. M. Prince, W. J.
Kress, K. C. Nixon & J. S. Farris. 2000. Angiosperm
phylogeny inferred from 18S rDNA, rbcL, and atpB
sequences. Bot. J. Linn. Soc. 133: 381-461.

Sun, G., Q. Ji, D. L. Dilcher, S. L. Zheng, K. C. Nixon & X. F.
Wang. 2002. Archaefructaceae, a new basal angiosperm 
family. Science 296: 899-904. 

Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin & 
D. G. Higgins. 1997. The ClustalX windows interface: 
Flexible strategies for multiple sequence alignment aided 
by quality analysis tools. Nucl. Acids Res. 24: 4876-4882. 

Thorne, J. L. & H. Kishino. 2002. Divergence time and 
evolutionary rate estimation with multilocus data. Syst. 
Biol. 51: 689-702. 

Thorne, J. L., H. Kishino & I. S. Painter. 1998. Estimating the rate of
evolution of the rate of molecular evolution. Molec. Biol. 
Evol. 15: 1647-1657. 

Wikström, N., V. Savolainen & M. W. Chase. 2001.
Evolution of the angiosperms: Calibrating the family tree.
Proc. Roy. Soc. London, Ser. B, Biol. Sci. 268: 2211-2220.

Yoo, M.-J., C. D. Bell, P. S. Soltis & D. E. Soltis. 2005.
Divergence times and historical biogeography of Nym-
phaeales. Syst. Bot. 30: 693-704.

Zhang, L.-B. & S. Renner. 2003. The deepest splits in
Chloranthaceae as resolved by chloroplast sequences. Int.
J. Pl. Sci. 164(5 suppl.): S383-S392.

Zuckerkandl, E. & L. Pauling. 1962. Molecular disease,
evolution, and genetic heterogeneity. Pp. 189-225 in M.
Kasha & B. Pullman (editors), Horizons in Biochemistry.
Academic Press, New York.

Zuckerkandl, E. & L. Pauling. 1965. Evolutionary divergence and
convergence in proteins. Pp. 97-166 in V. Bryson & H.
J. Vogel (editors), Evolving Genes and Proteins. Academic
Press, New York.


